Abstract

The melting curves of short heterogeneous DNA chains in solution are 
calculated on the basis of statistical thermodynamics and compared to 
experiments. The computation of the partition function is based on the
Peyrard-Bishop Hamiltonian, which has already been adopted in the theoretical
description of the melting of long DNA chains. In the case of short
chains it is necessary to consider not only the breaking of the hydrogen 
bonds between single base pairs, but also the complete dissociation of the 
two strands forming the double helix. 

There is a need for a theory of the melting of short DNA chains
(oligonucleotides). Melting is the highly cooperative thermal disruption of the
hydrogen bonds between complementary bases in the double helix, usually
monitored by the increase in UV absorption due to the unstacking of the
separated bases. At the equilibrium melting temperature half of the bonds are
disrupted. Synthetic oligonucleotides of fixed length and base pair sequence
have long been used as model systems for the study of the structural and
thermodynamical properties of the longer and more complex natural forms of DNA.
Many studies have shown the effects of both sequence and solvent composition on
the melting curves of oligonucleotides in solution. More recently, particular
attention has been given to sequence-specific effects on the thermal stability
of a variety of specially designed oligonucleotides, because of their importance
for molecular biological techniques in gene therapy and genome mapping.
Predictive information has been gained through an extensive thermodynamical
investigation of the melting behavior of oligonucleotides, based on the
computation of the Gibbs free energy, at a fixed solvent composition, as a sum
of contributions from nearest neighbors in the sequences. This phenomenology and
the predictive power of the thermodynamical approach should then be confronted
with a microscopic theory of short, heterogeneous DNA chains.

Modeling of DNA melting was initially motivated by the study of the important
process of transcription, in which the double helix has to be locally opened to
allow reading of the genetic code. It was based, already many years ago, on
Ising-like models, and more recently on an approach based on the modified
self-consistent phonon approximation. These methods allow only equilibrium
estimates of the probability of bond disruption. However, it is also important
to consider DNA dynamics, both at melting and at premelting temperatures. There
is an interest in relaxation and kinetic phenomena, which are relevant for
pharmacological applications, and in the study of nonlinear energy localization
and transduction. With a particular focus on the last problem, discrete
nonlinear models of DNA have been introduced (see, e.g., [12, 13], and for a
review [14]); sequence effects have been considered in [15]. These models are
appealing because they are simplified microscopic models with a small number of
degrees of freedom, and are thus affordable also for the simulation of very long
times. The experimentally available melting curves offer a way to optimize the
parameters of these models, and therefore also to increase confidence in their
use in dynamical studies.

With a particular interest in thermal stability, a dynamical model was
introduced by Peyrard and Bishop in 1989 [16] (PB model). The authors have
shown, through statistical mechanics calculations and constant temperature
molecular dynamics [16, 17, 18] applied to the case of a very long homogeneous
DNA chain, that the model can give a satisfactory melting curve, especially
after an improvement introduced in subsequent work. The PB model has
subsequently been applied to heterogeneous chains, either modeling the
heterogeneity with quenched disorder [19], or properly choosing basis sets of
orthonormal functions for the kernels appearing in the expression of the
partition function, but comparison with experimental data was not attempted. In
all these works the fact that the DNAs considered are quite long was essential,
for the following reason. In a solution with two types of DNA single strands, A
and B, there is a thermal equilibrium between dissociated strands and associated
double strands (the duplexes AB), and a thermal equilibrium, in the duplexes,
between broken and unbroken interbase hydrogen bonds. The average fraction
$\theta$ of bonded base pairs can then be factorized as
$\theta = \theta_{ext}\theta_{int}$. Here $\theta_{ext}$ is the average fraction
of strands forming duplexes, while $\theta_{int}$ is the average fraction of
unbroken bonds in the duplexes. The dissociation equilibrium can be neglected in
the case of long chains, where $\theta_{int}$, and thus $\theta$, go to 0 when
$\theta_{ext}$ is still practically 1. On the contrary, in the case of short
chains the processes of single bond disruption and strand dissociation tend to
happen in the same temperature range; therefore, the computation of both
$\theta_{int}$ and $\theta_{ext}$ is essential. In earlier work the
factorization of $\theta$ is stated, but only the case of long chains is then
considered.

The aim of this work is to show, through a comparison with experimental
data, that the one-dimensional PB model can be used to compute the melting
curves of short DNAs. It will also be shown how to take into account the
dissociation equilibrium.

The potential of the PB model is given by:

$$U = \sum_i \left[ \frac{k}{2} \left( 1 + \rho\, e^{-\alpha (y_i + y_{i-1})} \right) (y_i - y_{i-1})^2 + D_i \left( e^{-a_i y_i} - 1 \right)^2 \right] \qquad (1)$$

where $y_i$ is the distance between the i-th complementary bases minus their
equilibrium separation. The parameters $k$, $\rho$ and $\alpha$ refer to the
anharmonic stacking interaction, while the interbase bond is represented by a
Morse potential, with depth $D_i$ and width parameter $a_i$. In the earlier
works there is only a single depth $D$, because only homogeneous DNAs have been
considered. The stacking interaction, which in the first attempts [16, 17] was
purely harmonic ($\rho = 0$), decreases when the complementary bases get farther
apart ($\rho$ positive): this $\rho$-dependent nonlinear term was found to be
relevant to give cooperativity to the melting process.

To model heterogeneous DNAs, we have inserted two different values of $D_i$,
according to the two possible Watson-Crick base pairs: adenine-thymine (A-T)
and guanine-cytosine (G-C). The former has two hydrogen bonds, while the
latter has three. We have therefore chosen a depth for the G-C Morse potential
1.5 times that for the A-T Morse potential. The complete set of parameter values
that we have chosen is: $k = 0.025$ eV/Å$^2$, $\rho = 2$, $\alpha = 0.35$
Å$^{-1}$, $D_{AT} = 0.05$ eV, $D_{GC} = 0.075$ eV, $a_{AT} = 4.2$ Å$^{-1}$,
$a_{GC} = 6.9$ Å$^{-1}$. These values have been adjusted to reproduce the
experimentally observed melting temperature of long homogeneous DNA in the most
usual solvent conditions. For a given set of values, the melting temperatures
can be deduced with the transfer matrix method [16, 17, 18].
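As an illustration, the PB potential with the parameter values quoted above can be evaluated for a given sequence and set of stretchings. This is only a minimal sketch, not the authors' program: the function name `pb_energy`, the use of NumPy, and the treatment of the chain ends as free (no stacking term beyond the first and last base pair) are assumptions made here.

```python
import numpy as np

# Parameter values quoted in the text (energies in eV, lengths in angstrom).
K_STACK = 0.025   # k, eV/angstrom^2
RHO = 2.0         # rho, dimensionless
ALPHA = 0.35      # alpha, 1/angstrom
MORSE = {"AT": (0.05, 4.2), "GC": (0.075, 6.9)}  # (D in eV, a in 1/angstrom)
PAIR_TYPE = {"A": "AT", "T": "AT", "G": "GC", "C": "GC"}


def pb_energy(sequence, y):
    """Potential energy (eV) of a duplex with base pair stretchings y (angstrom).

    Hypothetical helper: free chain ends are an assumption of this sketch.
    """
    y = np.asarray(y, dtype=float)
    if y.shape != (len(sequence),):
        raise ValueError("need exactly one stretching per base pair")
    depth = np.array([MORSE[PAIR_TYPE[b]][0] for b in sequence])
    width = np.array([MORSE[PAIR_TYPE[b]][1] for b in sequence])
    # Morse term for the hydrogen bonds of each base pair.
    morse = depth * (np.exp(-width * y) - 1.0) ** 2
    # Anharmonic stacking between neighbouring base pairs.
    coupling = 1.0 + RHO * np.exp(-ALPHA * (y[1:] + y[:-1]))
    stacking = 0.5 * K_STACK * coupling * (y[1:] - y[:-1]) ** 2
    return float(morse.sum() + stacking.sum())
```

For a uniformly and widely stretched chain the stacking term vanishes and the energy tends to the sum of the Morse depths, so that `pb_energy("ATGC", [10.0] * 4)` is close to 0.25 eV.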

We have then made a statistical mechanics computation, in which partition
functions have been used to obtain both $\theta_{int}$ and $\theta_{ext}$. For
the computation of $\theta_{int}$ one has to separate the configurations
describing a double strand on the one hand, and dissociated single strands on
the other. The very possibility of dissociation makes this a nontrivial problem.
We have adopted the following strategy. The i-th bond is considered disrupted if
the value of $y_i$ is larger than a chosen threshold $y_0$. We have therefore
defined a configuration to belong to the double strand if at least one of the
$y_i$ is smaller than $y_0$. It is then natural to define $\theta_{int}$ for an
$N$ base pair duplex by:

$$\theta_{int} = \frac{1}{N} \sum_{i=1}^{N} \langle \vartheta(y_0 - y_i) \rangle \qquad (2)$$

where $\vartheta(y)$ is the Heaviside step function and the canonical average
$\langle \cdot \rangle$ is defined considering only the double strand
configurations. We have chosen a value of 2 Å for $y_0$. After a discretization
of the coordinate variables and the introduction of a proper cutoff on the
maximum value of the $y_i$, the computations needed for the canonical averages
are readily reduced to the multiplication of finite matrices, since the
potential (1) couples only nearest neighbors, and are easily performed by
suitable computer programs.
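The reduction to finite matrix multiplications described above can be sketched as follows. This is an illustrative implementation under stated assumptions, not the authors' code: the grid limits `y_min` and `y_max`, the number of grid points `n_grid`, the simple rectangle-rule integration and the free chain ends are all choices made here, and the function name `theta_int` is hypothetical. The fully open configurations (every $y_i$ at or above $y_0$) are subtracted from the total weight, so that the average runs only over double strand configurations.

```python
import numpy as np

KB = 8.617333e-5                          # Boltzmann constant, eV/K
K_STACK, RHO, ALPHA = 0.025, 2.0, 0.35    # eV/angstrom^2, -, 1/angstrom
MORSE = {"A": (0.05, 4.2), "T": (0.05, 4.2),      # (D in eV, a in 1/angstrom)
         "G": (0.075, 6.9), "C": (0.075, 6.9)}


def theta_int(sequence, temperature, y0=2.0, y_min=-0.3, y_max=12.0, n_grid=300):
    """Average fraction of closed base pairs over double strand configurations."""
    beta = 1.0 / (KB * temperature)
    y = np.linspace(y_min, y_max, n_grid)   # assumed grid and cutoff
    dy = y[1] - y[0]
    closed = y < y0
    # On-site Boltzmann weights from the Morse potential of each base pair.
    site = [np.exp(-beta * d * (np.exp(-a * y) - 1.0) ** 2)
            for d, a in (MORSE[b] for b in sequence)]
    # Symmetric nearest-neighbour kernel from the stacking term, times dy.
    yi, yj = np.meshgrid(y, y, indexing="ij")
    stack = 0.5 * K_STACK * (1.0 + RHO * np.exp(-ALPHA * (yi + yj))) * (yi - yj) ** 2
    kernel = np.exp(-beta * stack) * dy

    n = len(sequence)
    # fwd[i] sums the weight over the sites to the left of i (site i included),
    # bwd[i] over the sites to the right of i (site i excluded).
    fwd = [site[0]]
    for i in range(1, n):
        fwd.append(site[i] * (kernel @ fwd[-1]))
    bwd = [np.ones(n_grid)]
    for i in range(n - 1, 0, -1):
        bwd.append(kernel @ (site[i] * bwd[-1]))
    bwd.reverse()

    z_total = fwd[-1].sum()
    # Weight of the fully open configurations, excluded from the average.
    open_vec = site[0] * ~closed
    for i in range(1, n):
        open_vec = site[i] * ~closed * (kernel @ open_vec)
    z_double = z_total - open_vec.sum()
    # Sum over sites of the weight of configurations with y_i < y0.
    z_closed = sum((fwd[i] * bwd[i])[closed].sum() for i in range(n))
    return z_closed / (n * z_double)
```

Calling `theta_int("GTGTTAACGTGAGTATAGCGT", T)` on a range of temperatures gives an internal melting profile for that grid; the result depends on the assumed cutoff `y_max`, which should be varied to check convergence.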

Let us now consider $\theta_{ext}$. At equilibrium the chemical potentials of
the three species A, B and AB are related by the equation
$\mu_{AB} - \mu_A - \mu_B = 0$. Using the definition of the chemical potentials
as derivatives of the free energy, and in turn the relation of the latter to the
partition functions, we obtain an equation involving appropriate partition
functions. In the usual experimental conditions the solutions can be considered
ideal; with the further assumption that the model takes into account effectively
the presence of the solvent, we get the usual equilibrium condition:

$$\frac{N_{AB}\, Z(A)\, Z(B)}{N_A N_B\, Z(AB)} = 1 \qquad (3)$$

where $N_j$ is the number of molecules of species $j$ in the volume $V$
considered, and $Z(j)$ is the partition function of a molecule of species $j$ in
$V$ [23]. The numbers $N_j$ are related by the constraints
$2N_{AB} + N_A + N_B = \mathrm{const} = 2N_0$ and
$\Delta N_A = \Delta N_B = -\Delta N_{AB}$. Considering the case $N_A = N_B$
(the experimental curves that we are presenting were obtained in these
conditions, with the duplex formed by annealing equal concentrations of A and
B), we arrive at the following expression for $\theta_{ext} = N_{AB}/N_0$:

$$\theta_{ext} = 1 + \delta - \sqrt{\delta^2 + 2\delta} \qquad (4)$$

where $\delta$ is given by the following expression:

$$\delta = \frac{Z(A)Z(B)}{2N_0 Z(AB)} = \frac{Z_{int}(A)Z_{int}(B)}{a_{av} Z_{int}(AB)} \, \frac{a_{av} Z_{ext}(A)Z_{ext}(B)}{2N_0 Z_{ext}(AB)} \qquad (5)$$

where in the rightmost side we have introduced the separation of the partition
functions into an internal and an external part; the meaning of $a_{av}$ will be
explained in a moment. For the calculation of the internal functions, which do
not include the overall translation of the molecules, we use the DNA model
described above (which is also simply adapted to the description of single
strands, allowing an analytical evaluation: only a harmonic stacking interaction
remains, which is weaker than in the duplex, since in this case the term
involving $\rho$ is 0). We have chosen to insert in the last member of Eq. (5)
$a_{av} = \sqrt{a_{AT} a_{GC}}$ to make both fractions separately dimensionless,
so that they cannot depend on the choice of units. Without any such
normalization the first fraction would have the dimensions of an inverse length,
since the overall translation is not included in $Z_{int}$. It is included in
the external functions, which, however, also have to take into account the
dynamics not described by the simple one-dimensional model, and related to
conformational movements (like, for example, the winding of the strands). This
point has already been considered in Ising models: the influence on the
dissociation process of the degrees of freedom not described by the model cannot
be neglected, and it must be accounted for in some way. In analogy to what has
been proposed for the Ising models on the basis of the partition functions of
rigid bodies, we make the following choice:

$$\frac{a_{av} Z_{ext}(A) Z_{ext}(B)}{2N_0 Z_{ext}(AB)} = \frac{n^*}{n_0}\, N^{-p\theta_{int} + q} \qquad (6)$$

where the parameters $p$ and $q$ can be fixed by a comparison with experimental
melting curves; $n_0$ is the single strand concentration $N_0/V$, and $n^*$ is a
chosen reference concentration (we have taken 1 μM, a usual concentration in
experiments). We defer further comments about this equation until after the
presentation of the results.
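The dissociation part of the calculation can be sketched in a few lines. With $N_A = N_B = N_0 - N_{AB}$, the equilibrium condition reduces to $(1 - \theta_{ext})^2 = 2\delta\,\theta_{ext}$, whose root in $[0, 1]$ is $1 + \delta - \sqrt{\delta^2 + 2\delta}$. The code below is a hedged illustration only: the function names are hypothetical, the dimensionless internal fraction of $\delta$ is passed in as a number (`internal_ratio`) rather than computed, and the external fraction is assumed to have the form $(n^*/n_0)\,N^{-p\theta_{int}+q}$ with concentrations in μM.

```python
import math


def delta_external(theta_int, n_bp, p, q, n0_uM, n_star_uM=1.0):
    """Assumed external fraction of delta: (n*/n0) * N**(-p*theta_int + q)."""
    return (n_star_uM / n0_uM) * n_bp ** (-p * theta_int + q)


def theta_ext(delta):
    """Root in [0, 1] of (1 - t)**2 = 2*delta*t.

    Equal to 1 + delta - sqrt(delta**2 + 2*delta); the two roots of the
    quadratic multiply to 1, so the reciprocal form below is algebraically
    identical and avoids cancellation when delta is large.
    """
    return 1.0 / (1.0 + delta + math.sqrt(delta * (delta + 2.0)))


def theta_total(theta_int, internal_ratio, n_bp, p, q, n0_uM):
    """Fraction of bonded base pairs, theta = theta_ext * theta_int.

    internal_ratio is a hypothetical input standing for the dimensionless
    internal fraction of delta, to be computed from the model separately.
    """
    delta = internal_ratio * delta_external(theta_int, n_bp, p, q, n0_uM)
    return theta_ext(delta) * theta_int
```

In this form the strand concentration enters only through the prefactor of `delta_external`, so comparing two concentrations for the same sequence requires changing `n0_uM` alone.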

We show here the comparison of our calculations with the experimental
melting curves that have been obtained, in our lab, for three different
oligonucleotides, in a 10 mM Na phosphate buffer, 0.1 mM Na2EDTA, 200 mM NaCl,
pH 6.7. One of the oligonucleotides contained 27 base pairs, and the other two
had 21 base pairs. The sequences are given by:

s1) 5'-CTTCTTATTCTTATTGTTCGTCTTCTC-3'

s2) 5'-CTCTTCTCTTCTTTCTCTCTC-3'

s3) 5'-GTGTTAACGTGAGTATAGCGT-3',

and by the respective complementary strands. We have considered the case s3)
at two different concentrations. The single strand concentrations were: s1) 2.4
μM, s2) 1.7 μM, s3) 3.1 and 120 μM. In Fig. 1 we show the experimental and
computed melting curves. As can be seen, there are sequence and concentration
effects on the experimental melting curves, which are well reproduced by the
computed curves. Note that a 40-fold concentration increase for s3) yields an
increase of only 5 degrees in the melting temperature (a logarithmic dependence
on the concentration is expected). Similar differences between curves at the low
concentrations should then be due to sequence and length effects. We would like
to stress that in the case s3) the parameters $p$ and $q$ have been fitted to
the experimental curve at the lower concentration. The comparison with the
experimental curve at the higher concentration has then been performed changing
only the value of $n_0$ in Eq. (6), without changing the values of $p$ and $q$;
this has reproduced the difference between the melting temperatures of the two
cases, which differ by about 5 degrees. This fact indicates that the
concentration dependence of the left hand side of Eq. (6) is described by the
preexponential factor, while the parameters $p$ and $q$ are related to the
molecular conformation.
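The expected logarithmic dependence can be made explicit with a short sketch. It assumes that the strand concentration enters $\delta$ only through the prefactor $n^*/n_0$, and that the melting point corresponds approximately to a fixed value $\delta_m$ of $\delta$; this is only approximate, since the melting curve also involves $\theta_{int}$. The function $g(T)$ is a hypothetical shorthand for all the temperature-dependent factors of $\delta$, and $n_0'$ denotes the second concentration.

```latex
\delta(T, n_0) = \frac{n^*}{n_0}\, g(T), \qquad
\delta(T_m, n_0) = \delta_m
\;\Longrightarrow\;
\ln g(T_m) = \ln n_0 + \ln\frac{\delta_m}{n^*},
\qquad
\Delta T_m \approx \frac{\ln (n_0'/n_0)}
  {\left. \mathrm{d} \ln g / \mathrm{d} T \right|_{T_m}} .
```

For the 40-fold increase considered here the numerator is $\ln 40 \approx 3.7$, so even a large change in concentration produces a modest shift of the melting temperature when $\ln g$ varies rapidly with $T$ near melting.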

In conclusion, our comparisons show that it is feasible to compute the
equilibrium melting profile of DNA oligonucleotides with the PB nonlinear model.
We would also like to note that the modeling of the ratio of the external
partition functions as in Eq. (6) is very similar to that adopted in Ising
models for medium size DNAs (100-600 base pairs). This confirms that this term
is related to the conformational flexibility of the double and single strands,
not described by a one-dimensional model. The internal term is related to the
one-dimensional Hamiltonian and thus to nearest neighbor interactions. For long
DNAs (large $N$), at temperatures at which $\theta_{int}$ is already close to 0,
the part of Eq. (5) depending on the internal partition functions goes as
$e^{-\gamma N}$ for some positive $\gamma$, and thus $\delta \approx 0$ and
$\theta_{ext} \approx 1$. This $N$ dependence of the internal part can be seen,
for example, in the case of homogeneous sequences with the transfer matrix
method [16, 17, 18]. It is expected to be the same for heterogeneous sequences.

Figure 1: Experimental melting profiles (full lines) and theoretical results
(dashed lines) for the three DNA chains. We have plotted the value of
$\phi = 1 - \theta$. Panel a): sequence s1; panel b): sequence s2; panel c):
sequence s3 at the lower concentration; panel d): sequence s3 at the higher
concentration. The fitted parameters $p$ and $q$ have the following values:
$p = 32.43$ and $q = 29.30$ for s1; $p = 36.77$ and $q = 34.89$ for s2;
$p = 29.49$ and $q = 27.69$ for s3.

In very short chains like ours, it is not surprising that the specific sequence
has some influence on the parameters $p$ and $q$, while in medium chains some
self-averaging effects should already take place. In fact, as shown in the
caption to Fig. 1, we have found differences of about 25 percent in the
parameters referring to different sequences. We are now working on a more
extended set of melting curves for a properly chosen set of oligonucleotides,
which can help in the attempt to find the relation between the specific sequence
and the optimized parameters. It would then be possible to test the predictive
power of this model and to compare it with the predictions of purely
thermodynamical calculations.

In a more extended paper in preparation we will show a more exhaustive
comparison with experimental curves. We will also check whether a simple
analysis based on the number of occurrences of the different intrastrand nearest
neighbor couples in the sequences is sufficient to obtain the parameters,
similarly to what happens in the calculations of the Gibbs free energy of short
oligonucleotides.

We are very grateful to F. Barone, M. Matzeu, F. Mazzei and F. Pedone for 
providing the experimental melting curves and for illuminating discussions. 